A New Address-Free Memory Hierarchy Layer for Zero-Cycle Load

Authors

  • Lu Peng
  • Jih-Kwon Peir
  • Konrad Lai
Abstract

Data communication between producer and consumer instructions through memory incurs extra delays that degrade processor performance. In this paper, we introduce a new storage medium with a novel addressing mechanism that avoids address calculation. Instead of a memory address, each load and store is assigned a signature for accessing the new storage. A signature consists of the color of the base register together with the displacement value. To represent distinct register contents, a unique color is assigned to a register whenever the register is updated. When two memory instructions have the same signature, they address the same memory location. This memory signature can be formed early in the processor pipeline. For fast data communication, a small Signature Buffer, addressed by the memory signature, can be established to let stores and loads bypass the normal memory hierarchy. Performance evaluations based on an Alpha 21264-like pipeline using SPEC2000 benchmarks show that an IPC (instructions per cycle) improvement of 12-17% is possible using a small 8-entry signature buffer.
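The signature mechanism described in the abstract can be pictured with a small software model. The following is only a minimal sketch under stated assumptions, not the authors' hardware design: the class names `ColorTable` and `SignatureBuffer`, the use of a running counter for colors, and the oldest-first replacement policy are all hypothetical choices made for illustration. Only the signature definition (base-register color plus displacement), the recoloring on every register update, and the 8-entry buffer size come from the abstract.

```python
from collections import OrderedDict
from itertools import count


class ColorTable:
    """Tracks one 'color' per register (hypothetical helper, not from the paper)."""

    def __init__(self):
        self._next_color = count(1)  # assumption: colors are unique integers
        self._color = {}

    def update(self, reg):
        # Every write to a register changes its content, so assign a fresh color.
        self._color[reg] = next(self._next_color)

    def signature(self, base_reg, displacement):
        # Signature = (color of base register, displacement).
        # No effective address is computed here.
        if base_reg not in self._color:
            self.update(base_reg)
        return (self._color[base_reg], displacement)


class SignatureBuffer:
    """Small buffer indexed by signature instead of by memory address."""

    def __init__(self, entries=8):
        self.entries = entries
        self._data = OrderedDict()

    def store(self, sig, value):
        if sig in self._data:
            self._data.pop(sig)  # refresh an existing entry
        elif len(self._data) >= self.entries:
            self._data.popitem(last=False)  # assumption: evict the oldest entry
        self._data[sig] = value

    def load(self, sig):
        # None models a miss: the load would use the normal memory hierarchy.
        return self._data.get(sig)


colors = ColorTable()
buf = SignatureBuffer()

colors.update("r1")                          # r1 is written: gets a color
buf.store(colors.signature("r1", 16), 42)    # store 42 to 16(r1)
hit = buf.load(colors.signature("r1", 16))   # load from 16(r1): same signature

colors.update("r1")                          # r1 is overwritten: new color
miss = buf.load(colors.signature("r1", 16))  # signature no longer matches
```

In this model `hit` is 42 and `miss` is `None`. Note that the abstract only states that equal signatures imply the same location; two instructions that reach the same address through different base registers have different signatures, and this sketch makes no attempt to handle that case.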


Similar resources

Address-free memory access based on program syntax correlation of loads and stores

An increasing cache latency in next-generation processors incurs profound performance impacts in spite of advanced out-of-order execution techniques. One way to circumvent this cache latency problem is to predict load values at the onset of pipeline execution by exploiting either the load value locality or the address correlation of stores and loads. In this paper, we describe a new load value ...


Knapsack: A Zero-Cycle Memory Hierarchy Component

The widening gap between processors and memory necessitates the development of novel memory hierarchies: hierarchies that can possibly service memory references at register speeds, since service at cache speeds may not be adequate. We consider the design of a novel memory hierarchy component, a knapsack, whose purpose is to provide (very) fast access to frequently-used data objects. Software al...


A Multi-Core Pipelined Architecture for Parallel Computing

Parallel programming on multi-core processors has become the industry’s biggest software challenge. This paper proposes a novel parallel architecture for executing sequential programs using multi-core pipelining based on program slicing by a new memory/cache dynamic management technology. The new architecture is very suitable for processing large geospatial data in parallel without parallel pro...


Hardware and Software Mechanisms for Reducing Load Latency

As processor demands quickly outpace memory, the performance of load instructions becomes an increasingly critical component to good system performance. This thesis contributes four novel load latency reduction techniques, each targeting a different component of load latency: address calculation, data cache access, address translation, and data cache misses. The contributed techniques are as fol...


Comparing Multiported Cache Schemes

The performance of the data memory hierarchy is extremely important in current and near future high performance superscalar microprocessors. To address the memory gap, computer designers implement caches to reduce the high memory latencies that are observed in the processor. Due to the ever increasing instruction window sizes and issue widths in new microprocessor designs, designers will need t...



Journal:
  • J. Instruction-Level Parallelism

Volume 6

Publication date: 2004